SwePub
Result list for search "db:Swepub ;pers:(Jantsch Axel);srt2:(2015-2019)"

  • Results 1-10 of 22
1.
  • Feng, Chaochao, et al. (authors)
  • Performance analysis of on-chip bufferless router with multi-ejection ports
  • 2015
  • In: Proceedings - 2015 IEEE 11th International Conference on ASIC, ASICON 2015. - : IEEE conference proceedings. - 9781479984831
  • Conference paper (peer-reviewed), abstract:
    • In general, a bufferless NoC router has only one local output port for ejection, which may lead to multiple arriving flits competing for that single port. In this paper, we propose a reconfigurable bufferless router in which the number of ejection ports can be configured as 2, 3 or 4. Simulation results demonstrate that the average packet latency of the routers with multiple ejection ports is on average 18%, 10%, 6%, 14%, 9% and 7% lower than that of the router with 1 ejection port under six synthetic workloads, respectively. For application workloads, the average packet latency of the router with more than two ejection ports is only slightly better than that of the router with only one ejection port, a difference that can be neglected. Weighing hardware cost against performance, we conclude that there is no need to implement bufferless routers with 3 or 4 ejection ports, as the router with 2 ejection ports achieves almost the same performance as the routers with 3 and 4 ejection ports.
2.
  •  
3.
  • Haghbayan, M.-H., et al. (authors)
  • MapPro : Proactive runtime mapping for dynamic workloads by quantifying ripple effect of applications on networks-on-chip
  • 2015
  • In: Proceedings - 2015 9th IEEE/ACM International Symposium on Networks-on-Chip, NOCS 2015. - New York, NY, USA : Association for Computing Machinery (ACM). - 9781450333962
  • Conference paper (peer-reviewed), abstract:
    • Increasing dynamic workloads running on NoC-based many-core systems necessitate efficient runtime mapping strategies. Given the unpredictable nature of application profiles, selecting a rational region to map an incoming application is an NP-hard problem in view of minimizing congestion and maximizing performance. In this paper, we propose a proactive region selection strategy which prioritizes nodes that offer lower congestion and dispersion. Our proposed strategy, MapPro, quantitatively represents the propagated impact of spatial availability and dispersion on the network with every newly mapped application. This allows us to identify a suitable region to accommodate an incoming application that results in minimal congestion and dispersion. We cluster the network into squares of different radii to suit applications of different sizes and proactively select a suitable square for a new application, eliminating the overhead caused by typical reactive mapping approaches. We evaluated our proposed strategy over different traffic patterns and observed gains of up to 41% in energy efficiency, 28% in congestion and 21% in dispersion when compared to the state-of-the-art region selection methods. Copyright 2015 ACM.
4.
  •  
5.
  • Huang, Letian, et al. (authors)
  • Non-Blocking Testing for Network-on-Chip
  • 2016
  • In: IEEE Transactions on Computers. - : IEEE. - 0018-9340 .- 1557-9956. ; 65:3, pp. 679-692
  • Journal article (peer-reviewed), abstract:
    • To achieve high reliability in on-chip networks, it is necessary to test the network as frequently as possible to detect physical failures before they lead to system-level failures. A main obstacle is that the circuit under test has to be isolated, resulting in network cuts and packet blockage which limit the testing frequency. To address this issue, we propose a comprehensive network-level approach which can test multiple routers simultaneously at high speed without blocking or dropping packets. We first introduce a reconfigurable router architecture allowing the cores to keep their connections with the network while the routers are under test. A deadlock-free and highly adaptive routing algorithm is proposed to support reconfigurations for testing. In addition, a testing sequence is defined that allows multiple routers to be tested without dropping packets. A procedure is proposed to control the behavior of the affected packets during the transition of a router from the normal to the testing mode and vice versa. This approach neither interrupts the execution of applications nor has a significant impact on the execution time. Experiments with the PARSEC benchmarks on an 8x8 NoC-based chip multiprocessor show only a 3 percent execution time increase with four routers simultaneously under test.
6.
  • Jafari, Fahimeh, et al. (authors)
  • Least Upper Delay Bound for VBR Flows in Networks-on-Chip with Virtual Channels
  • 2015
  • In: ACM Transactions on Design Automation of Electronic Systems. - : Association for Computing Machinery (ACM). - 1084-4309 .- 1557-7309. ; 20:3
  • Journal article (peer-reviewed), abstract:
    • Real-time applications such as multimedia and gaming require stringent performance guarantees, usually enforced by a tight upper bound on the maximum end-to-end delay. For FIFO multiplexed on-chip packet switched networks we consider worst-case delay bounds for Variable Bit-Rate (VBR) flows with aggregate scheduling, which schedules multiple flows as an aggregate flow. VBR flows are characterized by a maximum transfer size (L), peak rate (p), burstiness (sigma), and average sustainable rate (rho). Based on network calculus, we present and prove theorems to derive per-flow end-to-end Equivalent Service Curves (ESC), which are in turn used for computing Least Upper Delay Bounds (LUDBs) of individual flows. In a realistic case study we find that the end-to-end delay bound is up to 46.9% more accurate than the case without considering the traffic peak behavior. Results show similar improvements for synthetic traffic patterns. The proposed methodology is implemented in C++ and has low run-time complexity, enabling quick evaluation for large and complex SoCs.
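Illustration for entry 6 (not part of the catalog record): the four VBR parameters named in the abstract, L, p, sigma and rho, are commonly combined in network calculus into the arrival curve min(L + p*t, sigma + rho*t). The sketch below is a minimal single-node example, assuming a rate-latency server with rate R and latency T (both hypothetical parameters that do not appear in the abstract). It uses the textbook bound for this case and is not the paper's ESC/LUDB derivation for aggregate FIFO scheduling. It only shows why accounting for the peak rate can give a smaller bound than a plain (sigma, rho) model.

```python
def vbr_arrival_curve(t, L, p, sigma, rho):
    """Upper bound on the traffic a VBR flow may send in any window of length t."""
    if t <= 0:
        return 0.0
    return min(L + p * t, sigma + rho * t)


def delay_bound_with_peak(L, p, sigma, rho, R, T):
    """Delay bound through an assumed rate-latency server beta(t) = R * max(0, t - T).

    Textbook single-node result: T + (L + theta * max(0, p - R)) / R,
    where theta = (sigma - L) / (p - rho) is the time at which the two
    segments of the arrival curve intersect.
    """
    if not (p > rho and sigma >= L):
        raise ValueError("expected p > rho and sigma >= L")
    if R < rho:
        raise ValueError("service rate below sustainable rate: delay is unbounded")
    theta = (sigma - L) / (p - rho)
    return T + (L + theta * max(0.0, p - R)) / R


def delay_bound_without_peak(sigma, rho, R, T):
    """Same server, but the flow is modelled only by its (sigma, rho) envelope."""
    if R < rho:
        raise ValueError("service rate below sustainable rate: delay is unbounded")
    return T + sigma / R
```

Calling both bound functions with the same (sigma, rho, R, T) shows the effect of the peak-rate term: when sigma >= L, the bound that uses L and p is never larger than the one that ignores them.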
7.
  • Jafari, Fahimeh, et al. (authors)
  • Weighted Round Robin Configuration for Worst-Case Delay Optimization in Network-on-Chip
  • 2016
  • In: IEEE Transactions on Very Large Scale Integration (VLSI) Systems. - : IEEE. - 1063-8210 .- 1557-9999. ; 24:12, pp. 3387-3400
  • Journal article (peer-reviewed), abstract:
    • We propose an approach for computing the end-to-end delay bound of individual variable bit-rate flows in a First Input First Output multiplexer with aggregate scheduling under a weighted round robin (WRR) policy. To this end, we use network calculus to derive per-flow end-to-end equivalent service curves, which are employed for computing least upper delay bounds (LUDBs) of the individual flows. Since real-time applications need guaranteed services with low delay bounds, we optimize the weights in the WRR policy to minimize the LUDBs while satisfying the performance constraints. We formulate two constrained delay optimization problems, namely, minimize-delay and multiobjective optimization. Multiobjective optimization has both the total delay bounds and their variance as the minimization objectives. The proposed optimizations are solved using a genetic algorithm. A video object plane decoder case study exhibits a 15.4% reduction of the total worst-case delays and a 40.3% reduction in the variance of delays when compared with a round robin policy. The optimization algorithm has low run-time complexity, enabling quick exploration of large design spaces. We conclude that an appropriate weight allocation can be a valuable instrument for delay optimization in on-chip network designs.
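Illustration for entry 7 (not part of the catalog record): a minimal sketch of how WRR weights can change a total delay bound. Several simplifications are assumed here and are not taken from the paper. Each flow is a plain token-bucket (sigma, rho) flow rather than a full VBR flow. A WRR scheduler with weights w_i on a link of capacity C is approximated as giving flow i a rate-latency service with rate C*w_i/sum(w) and latency (sum(w) - w_i)/C. The genetic algorithm is replaced by an exhaustive search over small integer weights. All function and parameter names are hypothetical.

```python
from itertools import product


def wrr_service(weight, total_weight, capacity):
    """Approximate rate and latency that WRR guarantees to one flow (assumed model)."""
    rate = capacity * weight / total_weight
    latency = (total_weight - weight) / capacity
    return rate, latency


def total_delay(weights, flows, capacity):
    """Sum of per-flow delay bounds; infinite if any flow gets less than its rho."""
    total_weight = sum(weights)
    total = 0.0
    for weight, (sigma, rho) in zip(weights, flows):
        rate, latency = wrr_service(weight, total_weight, capacity)
        if rate < rho:
            return float("inf")
        total += latency + sigma / rate
    return total


def best_weights(flows, capacity, max_weight=4):
    """Exhaustive search over integer weights 1..max_weight (stand-in for a GA)."""
    candidates = product(range(1, max_weight + 1), repeat=len(flows))
    return min(candidates, key=lambda ws: total_delay(ws, flows, capacity))
```

Comparing `total_delay((1, 1), flows, capacity)` (plain round robin) with `total_delay(best_weights(flows, capacity), flows, capacity)` for two flows with unequal burst sizes gives a feel for the kind of gain the abstract reports. The numbers from this toy model are not comparable to the paper's results.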
8.
  •  
9.
  • Jiang, Ke (author)
  • Security-Driven Design of Real-Time Embedded Systems
  • 2015
  • Doctoral thesis (other academic/artistic), abstract:
    • Real-time embedded systems (RTESs) have been widely used in modern society. It is also very common to find them in safety- and security-critical applications, such as transportation and medical equipment. There are usually several constraints imposed on an RTES, for example timing, resource, energy, and performance constraints, which must be satisfied simultaneously. This makes the design of such systems a difficult problem. More recently, the security of RTESs has emerged as a major design concern, as more and more attacks have been reported. However, RTES security, as a parameter to be considered during the design process, has been overlooked in the past. This thesis approaches the design of secure RTESs, focusing on aspects that are particularly important in the context of RTESs, such as communication confidentiality and side-channel attack resistance. Several techniques are presented in this thesis for designing secure RTESs, including hardware/software co-design techniques for communication confidentiality on distributed platforms, a global framework for secure multi-mode real-time systems, and a scheduling policy for thwarting differential power analysis attacks. All the proposed solutions have been extensively evaluated in a large number of experiments, including two real-life case studies, which demonstrate the efficiency of the presented techniques.
10.
  • Kanduri, Anil, et al. (authors)
  • Accuracy-Aware Power Management for Many-Core Systems Running Error-Resilient Applications
  • 2017
  • In: IEEE Transactions on Very Large Scale Integration (VLSI) Systems. - : IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC. - 1063-8210 .- 1557-9999. ; 25:10, pp. 2749-2762
  • Journal article (peer-reviewed), abstract:
    • Power capping techniques based on dynamic voltage and frequency scaling (DVFS) and power gating (PG) are oriented toward power actuation, compromising on performance and energy. Inherent error resilience of emerging application domains, such as Internet-of-Things (IoT) and machine learning, provides opportunities for energy and performance gains. Leveraging accuracy-performance tradeoffs in such applications, we propose approximation (APPX) as another knob for closed-loop power management, to complement power knobs with performance and energy gains. We design a power management framework, APPEND+, that can switch between accurate and approximate modes of execution subject to system throughput requirements. APPEND+ considers the sensitivity of the application to error to make disciplined alternation between levels of APPX such that performance is maximized while error is minimized. We implement a power management scheme that uses APPX, DVFS, and PG knobs hierarchically. We evaluated our proposed approach over machine learning and signal processing applications along with two IoT case studies: an early warning score system and fall detection. APPEND+ yields 1.9x higher throughput, latency improved by up to five times, better performance per energy, and dark silicon mitigation compared with the state-of-the-art power management techniques over a set of applications ranging from high to no error resilience.